How to Calculate the Five-Number Summary
Understanding quartiles and the five-number summary is essential for describing a dataset's distribution. Quartiles divide the data into four equal parts, providing insight into variability and central tendency. The five-number summary builds on this concept by identifying key values that summarize the dataset’s spread. In this lesson, we will explore how to compute quartiles and the five-number summary through examples.
Quartiles
What Are Quartiles?
Quartiles are three values, \( Q_1 \), \( Q_2 \), and \( Q_3 \), that split an ordered dataset into four parts, each containing about the same number of values.
- First Quartile (\( Q_1 \)): The 25th percentile of the data, meaning about 25% of the values fall below this point.
- Second Quartile (\( Q_2 \)): The 50th percentile, also known as the median. Half of the values are below this point, and half are above.
- Third Quartile (\( Q_3 \)): The 75th percentile of the data, meaning about 75% of the values fall below this point.
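To make the percentile interpretation concrete, here is a minimal Python sketch. The eight-value dataset is hypothetical (it does not come from this lesson), and each quartile is taken as the median of one half of the ordered data, the same approach used in the worked example later in this lesson.

```python
from statistics import median

# Hypothetical dataset, already in ascending order (not from the lesson).
data = [2, 4, 6, 8, 10, 12, 14, 16]

half = len(data) // 2
q1 = median(data[:half])   # median of the lower half
q2 = median(data)          # median of the whole dataset
q3 = median(data[half:])   # median of the upper half


def share_below(cutoff):
    """Fraction of values in `data` that are strictly below `cutoff`."""
    return sum(x < cutoff for x in data) / len(data)


for name, q in [("Q1", q1), ("Q2", q2), ("Q3", q3)]:
    print(f"{name} = {q}: {share_below(q):.0%} of values fall below")
```

For this small dataset the shares come out to exactly 25%, 50%, and 75%. With real data, ties and dataset sizes that are not multiples of four usually make the percentages approximate.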
Example 1
Scientists recorded the diameters (in kilometers) of several meteorite impact craters on Earth. Using the dataset below, determine the first quartile (\( Q_1 \)) and third quartile (\( Q_3 \)).
Crater Diameters (km) | |||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
1.2 | 3.5 | 2.8 | 4.0 | 6.7 | 5.2 | 8.1 | 3.1 | 4.5 | 2.9 | 6.0 | 7.4 |
Solution
First, we arrange the data in ascending order:
\[
1.2 \quad 2.8 \quad 2.9 \quad 3.1 \quad 3.5 \quad 4.0 \quad 4.5 \quad 5.2 \quad 6.0 \quad 6.7 \quad 7.4 \quad 8.1
\]
To find \(Q_1\) and \(Q_3\), we must first find the median. There are 12 values, so the median is the average of the 6th and 7th values:
\[\text{median}=\dfrac{4.0+4.5}{2}=4.25.\]
This splits the data into two halves:
- the lower 50%: \(1.2 \quad 2.8 \quad 2.9 \quad 3.1 \quad 3.5 \quad 4.0\)
- the upper 50%: \(4.5 \quad 5.2 \quad 6.0 \quad 6.7 \quad 7.4 \quad 8.1\)
\(Q_1\) is the median of the lower 50% of the data:
\[Q_1=\dfrac{2.9+3.1}{2}=3.0\]
\(Q_3\) is the median of the upper 50% of the data:
\[Q_3=\dfrac{6.0+6.7}{2}=6.35\]
$$\tag*{\(\blacksquare\)}$$
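The hand calculation in Example 1 can be checked with a short Python sketch. The function name `quartiles_by_halves` is illustrative, not part of any library, and the handling of an odd number of values (leaving the middle value out of both halves) is an assumption, since the example only covers an even count.

```python
from statistics import median


def quartiles_by_halves(values):
    """Return (Q1, median, Q3) by taking the median of each half of the data."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # Assumption: when n is odd, the middle value is left out of both halves.
    upper = ordered[half + n % 2:]
    return median(lower), median(ordered), median(upper)


craters = [1.2, 3.5, 2.8, 4.0, 6.7, 5.2, 8.1, 3.1, 4.5, 2.9, 6.0, 7.4]
q1, q2, q3 = quartiles_by_halves(craters)
print(f"Q1 = {q1:.2f}, median = {q2:.2f}, Q3 = {q3:.2f}")
# Q1 = 3.00, median = 4.25, Q3 = 6.35
```

Other tools may use a different quartile convention (for example, interpolating between positions), so their results can differ slightly from the ones in this example.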
Example 2
A group of professional esports players was tested for their reaction times in milliseconds. Given the dataset below, use the Summary Statistics Calculator to compute the first quartile (\( Q_1 \)) and the third quartile (\( Q_3 \)).
Reaction Times (ms) | |||||||||
---|---|---|---|---|---|---|---|---|---|
175 | 180 | 185 | 189 | 190 | 195 | 195 | 198 | 200 | 202 |
205 | 210 | 210 | 215 | 215 | 220 | 225 | 230 | 235 | 240 |
245 | 250 | 250 | 255 | 260 | 265 | 270 | 275 | 280 | 290 |
Solution
Copy the data to the clipboard, open the Summary Statistics Calculator, and paste the data into the spreadsheet. Close the spreadsheet, and check the boxes for \(Q_1\) and \(Q_3\).
Therefore, we have that \(Q_1=198\) and \(Q_3=250\).
$$\tag*{\(\blacksquare\)}$$
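As a cross-check that does not depend on the calculator, the sketch below applies the median-of-halves approach from Example 1 to the reaction times. The lesson does not say which method the calculator uses internally, so treating it as median-of-halves is an assumption; it does reproduce the values reported above.

```python
from statistics import median

reaction_times = [
    175, 180, 185, 189, 190, 195, 195, 198, 200, 202,
    205, 210, 210, 215, 215, 220, 225, 230, 235, 240,
    245, 250, 250, 255, 260, 265, 270, 275, 280, 290,
]

ordered = sorted(reaction_times)
half = len(ordered) // 2      # 30 values, so each half holds 15

lower = ordered[:half]        # the 15 smallest values
upper = ordered[half:]        # the 15 largest values

q1 = median(lower)            # middle (8th) value of the lower half
q3 = median(upper)            # middle (8th) value of the upper half
print(q1, q3)                 # 198 250
```

Because each half contains an odd number of values, each quartile is an actual data value rather than an average of two neighbors.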
The Five-Number Summary
What Is the Five-Number Summary?
The five-number summary provides a quick numerical description of a data set by identifying key values that describe its distribution.
It consists of:
- The minimum value
- The first quartile (\( Q_1 \))
- The median (\( Q_2 \))
- The third quartile (\( Q_3 \))
- The maximum value
Example 3
Theme park engineers recorded the top speeds (in mph) of 30 different roller coasters worldwide. Using the dataset below, compute the five-number summary (minimum, \( Q_1 \), median, \( Q_3 \), maximum).
Roller Coaster Speeds (mph) | |||||||||
---|---|---|---|---|---|---|---|---|---|
44.7 | 60.9 | 65.2 | 52.8 | 74.6 | 68.4 | 59.0 | 63.3 | 55.3 | 80.8 |
88.2 | 49.7 | 77.7 | 85.7 | 55.9 | 61.5 | 78.9 | 93.2 | 67.1 | 57.8 |
62.1 | 71.3 | 54.0 | 83.9 | 90.2 | 50.9 | 99.4 | 96.3 | 86.8 | 46.6 |
Solution
Copy the data to the clipboard, open the Summary Statistics Calculator, and paste the data into the spreadsheet. Close the spreadsheet, and click on the Minimum, \(Q_1\), Median, \(Q_3\), and Maximum checkboxes.
Therefore, the five-number summary is:
\[\text{Minimum}=44.7\quad Q_1=55.9\quad \text{Median}=66.15\quad Q_3=83.9\quad\text{Maximum}=99.4\]
$$\tag*{\(\blacksquare\)}$$
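The same summary can be reproduced in a few lines of Python. This is a sketch under the assumption that quartiles are computed as medians of the lower and upper halves, as in Example 1; the function name `five_number_summary` is illustrative and does not refer to the calculator or to any library.

```python
from statistics import median


def five_number_summary(values):
    """Return minimum, Q1, median, Q3, and maximum as a dictionary."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # Assumption: when n is odd, the middle value is left out of both halves.
    upper = ordered[half + n % 2:]
    return {
        "minimum": ordered[0],
        "q1": median(lower),
        "median": median(ordered),
        "q3": median(upper),
        "maximum": ordered[-1],
    }


speeds = [
    44.7, 60.9, 65.2, 52.8, 74.6, 68.4, 59.0, 63.3, 55.3, 80.8,
    88.2, 49.7, 77.7, 85.7, 55.9, 61.5, 78.9, 93.2, 67.1, 57.8,
    62.1, 71.3, 54.0, 83.9, 90.2, 50.9, 99.4, 96.3, 86.8, 46.6,
]

summary = five_number_summary(speeds)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Values are printed to two decimal places because averaging two decimals in floating point can introduce tiny rounding differences in the last digits.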
Conclusion
Quartiles and the five-number summary are fundamental tools for summarizing a dataset’s distribution. Understanding how to compute them allows us to measure data spread and identify potential outliers. In future lessons, we will explore box plots and the interquartile range (IQR) as ways to visualize and further analyze data distributions.